Skip to content

feat: dynamic ingress control for running sandboxes#2093

Open
levb wants to merge 25 commits intolev-allow-deny-dynamicfrom
lev-ingress-control
Open

feat: dynamic ingress control for running sandboxes#2093
levb wants to merge 25 commits intolev-allow-deny-dynamicfrom
lev-ingress-control

Conversation

@levb
Copy link
Contributor

@levb levb commented Mar 11, 2026

Summary

  • Add ingress control to sandbox network config: port allowlist/denylist (allowPorts/denyPorts), client IP CIDR allowlist/denylist (allowIn/denyIn), and MaskRequestHost header rewriting
  • All fields are dynamically updatable via PUT /sandboxes/{sandboxID}/network alongside existing egress rules, applied atomically with rollback on failure
  • Ingress rules enforced at the orchestrator proxy layer; envd port is always exempt

How it works

  • Allow-wins-over-deny: if a client IP is in both allowIn and denyIn, it is allowed. Same for ports
  • allowIn requires 0.0.0.0/0 in denyIn (consistent with egress), otherwise the API rejects the request — without deny-all, allowIn is a no-op since default is allow-all
  • Client IP extracted from second-to-last X-Forwarded-For entry to avoid spoofing; X-E2B-Client-IP (set by client-proxy, stripped before forwarding) takes priority
  • CIDRs are pre-parsed into net.IPNet on config update, not on each request
  • MaskRequestHost supports a {port} placeholder replaced with the actual port from the routing subdomain
  • Rules survive pause/resume — integration tests verify this for both egress and ingress

Test plan

  • Unit tests: ExtractClientIP header precedence/spoofing, CIDR allow/deny priority, combined update with rollback on egress failure, API validation (port range, CIDR format, allowIn-without-deny-all rejection)
  • Integration tests: port deny/allow, client IP deny/allow with spoofed headers, envd exemption, combined egress+ingress in single PUT, MaskRequestHost with port placeholder, pause/resume preserves rules, clear
    restores access

levb and others added 5 commits March 10, 2026 17:23
Add port-level and client IP-level ingress restrictions that can be
configured at sandbox creation and updated at runtime. Follows the same
allow-wins-over-deny priority semantics as egress control.

Key changes:
- OpenAPI spec: add allowPorts, denyPorts, allowIn, denyIn fields to
  SandboxNetworkConfig (both create and update endpoints)
- Proto: add allowed_ports, denied_ports, allowed_client_cidrs,
  denied_client_cidrs to SandboxNetworkIngressConfig
- API handlers: validate ingress rules (port ranges, CIDR format),
  co-locate with egress validation
- Orchestrator proxy: enforce port and client IP restrictions on the
  hot path with pre-parsed CIDRs stored on sandbox Metadata
- Orchestrator server: ingress updates with rollback support alongside
  egress and end_time updates, eventData threaded through update closures
- Error pages: TemplatedError-based HTML/JSON responses for blocked
  ports and client IPs, matching existing traffic token error pattern
- Envd port (49983) is always exempt from ingress restrictions
- Unit tests for validation, proxy helpers, pre-parsed CIDRs, and
  server update rollback logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace local derefSlice with shared DerefOrDefault utility
- Use sandbox_network.IsIPOrCIDR for CIDR validation (matches egress)
- Extract validatePortList/validateCIDRList helpers to eliminate copy-paste
- Use GetNetworkIngress() directly in proxy to avoid per-request struct allocation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add integration tests for port and client IP ingress filtering through
the real proxy. Tests cover port deny, port allow-overrides-deny,
client IP deny, client IP allow-overrides-deny, envd exemption from
restrictions, clear restoring access, and pause/resume persistence.
Regenerate integration test API client with new ingress fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Start python HTTP servers inside the sandbox so the proxy connects
immediately instead of retrying on non-listening ports. Close response
bodies in proxyRequest helper and return status code directly. Remove
WithSecure(true) to avoid envd auth complexity in ingress tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
levb and others added 14 commits March 11, 2026 05:08
- Use .GetEgress()/.GetIngress() proto getters instead of direct field access
- Restructure empty if-blocks for port/client IP allow checks
- Use strings.Cut instead of strings.IndexByte (modernize)
- Remove unused extraHeaders param from proxyRequest in integration tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move extractClientIP and ClientIPHeader into shared reverseproxy package.
Fix XFF to use second-to-last entry (GCP LB appends real-client-IP + LB-IP).
Client-proxy now sets X-E2B-Client-IP; orchestrator strips it before sandbox.
Refactor ingress integration tests to table-driven with builder pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Exercises set/update/clear of MaskRequestHost via PUT /network,
including ${PORT} placeholder substitution. Uses a Python echo
server to verify the Host header seen inside the sandbox.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In CI, localhost resolves to ::1 (IPv6 loopback). IPv4-only CIDRs
like 0.0.0.0/0 don't match IPv6 addresses, causing deny tests to
get 200 instead of 403. Add ::/0 and ::/1 + 8000::/1 alongside
the IPv4 CIDRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use single-quote wrapping instead of %q for the Python script to avoid
shell quoting issues, and add WaitForStatus to ensure the server is
ready before running tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Require 0.0.0.0/0 in denyIn when allowIn is set, consistent with egress
validation. Without it, allowIn is a no-op (default is allow-all).

Also rename TestUpdateNetworkConfig to TestUpdateEgressConfig for
consistency with TestUpdateIngressConfig and TestUpdateMaskRequestHost,
and remove step numbers from subtest names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The rollback function captured the request context, which may be
cancelled by the time rollback runs. Use WithoutCancel so cleanup
always completes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace inline rollback loop with shared utils.ApplyAllOrNone, which
correctly passes context.WithoutCancel to rollback functions. Each
closure populates its eventData attributes directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ion test

Remove 4 self-evident unit tests (TestContainsPort,
TestIngressClientCIDRs_NilIngress, TestIngressClientCIDRs_SetNil_Clears,
TestUpdate_IngressOnly). Add TestUpdateCombinedEgressAndIngress to
verify both egress and ingress rules apply in a single PUT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@levb levb marked this pull request as ready for review March 11, 2026 18:30
@levb
Copy link
Contributor Author

levb commented Mar 11, 2026

@djeebus I don't know why the linter is failing consistently now in the nfs proxy, maybe you can take a look?

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c38bf3b288

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +95 to +98
if !containsPort(ingress.GetAllowedPorts(), port) {
if containsPort(ingress.GetDeniedPorts(), port) {
return nil, reverseproxy.NewErrPortNotAllowed(sandboxId, port)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Enforce allowPorts as a real allowlist

When allowPorts is configured, this logic still allows any port that is not explicitly listed in denyPorts, because it only rejects requests inside the nested if containsPort(denied) branch. That makes allowPorts effectively a no-op unless callers also deny every other port, which contradicts the API contract (allowPorts should restrict ingress to listed ports). For example, allowPorts=[8000] and empty denyPorts still permits traffic to 8001.

Useful? React with 👍 / 👎.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. This was an attempt to mimic the egress logic in base. It's probably much more intuitive for the user to treat allowPorts list as exclusive if denyPorts is empty; or maybe a * to express all ports?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2/5 we make this simple - allowPorts and denyPorts are mutually exclusive. allowPorts assumes everything else is denied, and denyPorts assumes everything else is allowed.

levb and others added 2 commits March 11, 2026 11:50
A malicious client could send X-E2B-Client-IP in their request and
ExtractClientIP would return the spoofed value. Delete the header
first so the IP is derived from trusted sources (XFF / RemoteAddr).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
levb and others added 3 commits March 12, 2026 06:42
Cherry-pick of e1eee7f from main. GetNetwork() returned the proto
pointer after releasing the read lock, allowing concurrent reads of
Egress/Ingress to race with SetNetworkEgress/SetNetworkIngress writes.
Replace GetNetwork() with GetNetworkEgress/GetNetworkIngress that hold
the lock while accessing sub-fields. Remove unused SetNetwork method.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Client-proxy strips incoming X-E2B-Client-IP and derives the client IP
from trusted sources (XFF/RemoteAddr). The spoofed-IP tests were setting
X-E2B-Client-IP directly, which got overwritten, causing 200 instead of
403. Spoof via X-Forwarded-For instead so the request flows through the
real client-proxy → orchestrator-proxy path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ValentaTomas ValentaTomas removed their request for review March 12, 2026 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants